mtmd : Support jinja in libmtmd (Only for QwenVL and Qwen Omni) #14730

Open
wants to merge 3 commits into
base: master
alielmorsy
This code comes from a private repo I’ve been working on. It adds essential Jinja support for multimodal setups.
The PR adds two new optional GGUF metadata fields:

  1. tokenizer.ggml.image_token_id: For the image token, if it exists.
  2. tokenizer.ggml.audio_token_id: For the audio token, if it exists.

If these fields are not present, a fallback lookup is used, similar to the FIM token lookup. The tokens currently used for images are <|IMAGE|> and <IMAGE>.
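
To illustrate the lookup order described above, here is a minimal sketch. It is not the code from this PR: `vocab_sketch`, `resolve_special_token`, `resolve_image_token`, and `TOKEN_NULL` are hypothetical names, and the two maps are stand-ins for the real GGUF metadata and vocabulary. The assumption is that the metadata key wins when present, and the known token texts are tried in order otherwise.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; these are not llama.cpp types.
using token_id = int32_t;
constexpr token_id TOKEN_NULL = -1;

struct vocab_sketch {
    std::map<std::string, token_id> metadata;      // GGUF key -> token id, only for keys that exist
    std::map<std::string, token_id> text_to_token; // token text -> token id
};

// Prefer the optional metadata key; otherwise try each fallback text in order.
token_id resolve_special_token(const vocab_sketch & vocab,
                               const std::string & meta_key,
                               const std::vector<std::string> & fallback_texts) {
    auto meta = vocab.metadata.find(meta_key);
    if (meta != vocab.metadata.end()) {
        return meta->second;
    }
    for (const auto & text : fallback_texts) {
        auto tok = vocab.text_to_token.find(text);
        if (tok != vocab.text_to_token.end()) {
            return tok->second;
        }
    }
    return TOKEN_NULL; // the model has no such token
}

// Image token: metadata key first, then the two token texts named in the description.
token_id resolve_image_token(const vocab_sketch & vocab) {
    return resolve_special_token(vocab, "tokenizer.ggml.image_token_id",
                                 {"<|IMAGE|>", "<IMAGE>"});
}
```

The audio token would go through the same helper with tokenizer.ggml.audio_token_id; the description does not list its fallback texts, so they are left out here.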

For the MTMD tokenizer, I kept backward compatibility and updated the split function to support multiple delimiters, so it works with both the old marker and the preserved tokens.
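
A rough sketch of what splitting on multiple delimiters could look like is below. This is an illustration rather than the PR's implementation: `split_keep_delims` is a hypothetical name, and the tie-breaking rule (earliest match wins, longer delimiter wins on a tie) is an assumption. Each delimiter is kept as its own part so the caller can tell where media should be inserted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `input` on any of `delims`, keeping each matched delimiter as its own part.
// Hypothetical helper for illustration; not the function from the PR.
std::vector<std::string> split_keep_delims(const std::string & input,
                                           const std::vector<std::string> & delims) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start < input.size()) {
        std::size_t best_pos = std::string::npos;
        std::size_t best_len = 0;
        for (const auto & d : delims) {
            if (d.empty()) {
                continue; // an empty delimiter would match everywhere
            }
            const std::size_t pos = input.find(d, start);
            if (pos == std::string::npos) {
                continue;
            }
            // Earliest match wins; on a tie, prefer the longer delimiter.
            if (pos < best_pos || (pos == best_pos && d.size() > best_len)) {
                best_pos = pos;
                best_len = d.size();
            }
        }
        if (best_pos == std::string::npos) {
            parts.push_back(input.substr(start)); // trailing text, no more delimiters
            break;
        }
        if (best_pos > start) {
            parts.push_back(input.substr(start, best_pos - start)); // text before the delimiter
        }
        parts.push_back(input.substr(best_pos, best_len)); // the delimiter itself
        start = best_pos + best_len;
    }
    return parts;
}
```

For example, assuming the old marker is the string <__media__>, calling split_keep_delims("a<__media__>b<|IMAGE|>c", {"<__media__>", "<|IMAGE|>"}) yields the five parts a, <__media__>, b, <|IMAGE|>, c.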

One final change (Qwen models only): I removed the image_start and image_end tokens, since the model already has its own special tokens.
